Record replay audio#252
Conversation
|
Firetiger deploy monitoring skipped This PR didn't match the auto-monitor filter configured on your GitHub connection:
Reason: PR modifies chromium launcher, wrapper, recorder, and config packages rather than the kernel API endpoints (packages/api/cmd/api/) or Temporal workflows (packages/api/lib/temporal) specified in the filter. To monitor this PR anyway, reply with |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Recording cap shorter than script
- I made the recording duration configurable per test and increased the Zombocom archive scenario cap to 70 seconds so recording no longer ends before the Playwright flow completes.
Or push these changes by commenting:
@cursor push 2f6a9b26ef
Preview (2f6a9b26ef)
diff --git a/server/e2e/e2e_recording_audio_test.go b/server/e2e/e2e_recording_audio_test.go
--- a/server/e2e/e2e_recording_audio_test.go
+++ b/server/e2e/e2e_recording_audio_test.go
@@ -52,7 +52,7 @@
return await page.title();
`, audioSite.ContainerURL())
- recordReplayAudio(t, ctx, c, playwrightCode, os.Getenv("RECORDING_AUDIO_OUTPUT_PATH"), 0.1)
+ recordReplayAudio(t, ctx, c, playwrightCode, os.Getenv("RECORDING_AUDIO_OUTPUT_PATH"), 35, 0.1)
}
func TestReplayRecordingZombocomArchiveAudio(t *testing.T) {
@@ -132,16 +132,15 @@
return playback;
`
- recordReplayAudio(t, ctx, c, playwrightCode, outputPath, 0.01)
+ recordReplayAudio(t, ctx, c, playwrightCode, outputPath, 70, 0.01)
}
-func recordReplayAudio(t *testing.T, ctx context.Context, c *TestContainer, playwrightCode string, outputPath string, minPeakLevel float64) {
+func recordReplayAudio(t *testing.T, ctx context.Context, c *TestContainer, playwrightCode string, outputPath string, maxDuration int, minPeakLevel float64) {
t.Helper()
client, err := c.APIClient()
require.NoError(t, err, "failed to create API client")
- maxDuration := 35
maxFileSize := 100
startResp, err := client.StartRecordingWithResponse(ctx, instanceoapi.StartRecordingJSONRequestBody{
MaxDurationInSeconds: &maxDuration,You can send follow-ups to the cloud agent here.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
There are 3 total unresolved issues (including 1 from previous review).
Autofix Details
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: Pulse socket before sink ready
- I load
module-null-sinkbeforemodule-native-protocol-unixso the Pulse socket appears only afterKernelOutputis created.
- I load
- ✅ Fixed: Keepalive exit kills PulseAudio
- I replaced
wait -nwith a keepalive retry loop and wait only on PulseAudio so apacatexit no longer terminates the daemon.
- I replaced
Or push these changes by commenting:
@cursor push 8197b93a61
Preview (8197b93a61)
diff --git a/shared/start-pulseaudio.sh b/shared/start-pulseaudio.sh
--- a/shared/start-pulseaudio.sh
+++ b/shared/start-pulseaudio.sh
@@ -22,16 +22,12 @@
--daemonize=no \
--log-target=stderr \
--exit-idle-time=-1 \
- --load="module-native-protocol-unix socket=/tmp/pulse/native auth-anonymous=1" \
- --load="module-null-sink sink_name=KernelOutput rate=48000 channels=2 sink_properties=device.description=KernelOutput" &
+ --load="module-null-sink sink_name=KernelOutput rate=48000 channels=2 sink_properties=device.description=KernelOutput" \
+ --load="module-native-protocol-unix socket=/tmp/pulse/native auth-anonymous=1" &
pulse_pid=$!
- keepalive_pid=""
cleanup() {
- if [ -n "$keepalive_pid" ]; then
- kill "$keepalive_pid" 2>/dev/null || true
- fi
kill "$pulse_pid" 2>/dev/null || true
wait 2>/dev/null || true
}
@@ -44,10 +40,12 @@
sleep 0.1
done
- (
- pacat --raw --rate=48000 --channels=2 --format=s16le --device=KernelOutput /dev/zero
- ) &
- keepalive_pid=$!
+ while kill -0 "$pulse_pid" 2>/dev/null; do
+ pacat --raw --rate=48000 --channels=2 --format=s16le --device=KernelOutput /dev/zero || true
+ if kill -0 "$pulse_pid" 2>/dev/null; then
+ sleep 0.1
+ fi
+ done
- wait -n "$pulse_pid" "$keepalive_pid"
+ wait "$pulse_pid"
'You can send follow-ups to the cloud agent here.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 4 total unresolved issues (including 3 from previous reviews).
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Darwin audio path uses incompatible PulseAudio source name
- Darwin now validates audio sources as numeric AVFoundation device indices or "none" in both parameter validation and ffmpeg argument construction, preventing invalid PulseAudio names from reaching ffmpeg.
Or push these changes by commenting:
@cursor push 159fdd82c8
Preview (159fdd82c8)
diff --git a/server/lib/recorder/ffmpeg.go b/server/lib/recorder/ffmpeg.go
--- a/server/lib/recorder/ffmpeg.go
+++ b/server/lib/recorder/ffmpeg.go
@@ -89,8 +89,16 @@
if p.MaxDurationInSeconds != nil && *p.MaxDurationInSeconds <= 0 {
return fmt.Errorf("max duration must be greater than 0 seconds")
}
- if p.recordAudio() && strings.TrimSpace(p.audioSource()) == "" {
- return fmt.Errorf("audio source is required when recording audio")
+ if p.recordAudio() {
+ audioSource := strings.TrimSpace(p.audioSource())
+ if audioSource == "" {
+ return fmt.Errorf("audio source is required when recording audio")
+ }
+ if runtime.GOOS == "darwin" && audioSource != "none" {
+ if _, err := strconv.Atoi(audioSource); err != nil {
+ return fmt.Errorf("audio source must be a numeric device index or \"none\" on darwin")
+ }
+ }
}
return nil
@@ -511,10 +519,15 @@
case "darwin":
audioDevice := "none"
if recordAudio {
- audioDevice = params.audioSource()
- if strings.TrimSpace(audioDevice) == "" {
+ audioDevice = strings.TrimSpace(params.audioSource())
+ if audioDevice == "" {
return nil, fmt.Errorf("audio source is required when recording audio")
}
+ if audioDevice != "none" {
+ if _, err := strconv.Atoi(audioDevice); err != nil {
+ return nil, fmt.Errorf("audio source must be a numeric device index or \"none\" on darwin")
+ }
+ }
}
args = []string{
// Input options for AVFoundationYou can send follow-ups to the cloud agent here.
Add a standalone module-null-source (KernelInput) so the browser sees a real, non-monitor microphone -- Chromium excludes monitor sources from enumerateDevices(), so the null-sink monitor alone left zero audioinput devices. Point neko's audio capture at KernelOutput.monitor so live view streams the browser audio. Neko's default device (audio_output.monitor) does not exist here, which is why live view had no audio. Co-authored-by: Cursor <cursoragent@cursor.com>
ce8e4a2 to
bfb3453
Compare
Extend TestReplayRecordingIncludesAudioTrack to enumerate media devices over a raw CDP websocket and require both an audiooutput and a non-monitor audioinput. Chromium drops PulseAudio monitor sources from the input list, so this guards the KernelInput null-source against regressing back to a monitor-only setup that antibot scripts flag. Co-authored-by: Cursor <cursoragent@cursor.com>
module-null-source rejects source_properties in this PulseAudio version, so the fake mic silently failed to load and the browser saw zero audioinput devices. Drop the unsupported arg (the source keeps the default description). In the e2e check, navigate to a file:// page (a secure context) before calling enumerateDevices, since navigator.mediaDevices is undefined on about:blank, and assert a non-monitor audioinput is present rather than matching a specific label. Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
Bugbot Autofix prepared fixes for both issues found in the latest run.
- ✅ Fixed: Pulse ready before sink exists
- The wrapper now waits for the configured Pulse sink to appear via pactl after the socket is ready before launching Chromium, removing the early-start race.
- ✅ Fixed: Encoder tuning always on
- Linux-specific low-latency x264 tuning is now applied only when audio recording is enabled and video-only Linux recordings regain -use_wallclock_as_timestamps.
Or push these changes by commenting:
@cursor push 0076192699
Preview (0076192699)
diff --git a/server/cmd/wrapper/main.go b/server/cmd/wrapper/main.go
--- a/server/cmd/wrapper/main.go
+++ b/server/cmd/wrapper/main.go
@@ -188,6 +188,7 @@
startAll("mutter")
}
waitForSocket(pulseSocket, 10*time.Second)
+ waitForPulseSink(os.Getenv("PULSE_SINK"), 10*time.Second)
startAll("chromium")
waitForSocket(dbusSocket, 10*time.Second)
if prof == profileHeadful && webrtc {
@@ -332,3 +333,21 @@
logf(format, args...)
os.Exit(1)
}
+
+func waitForPulseSink(sinkName string, timeout time.Duration) {
+ sinkName = strings.TrimSpace(sinkName)
+ if sinkName == "" {
+ logf("WARNING: pulse sink name is empty; skipping sink wait")
+ return
+ }
+
+ deadline := time.Now().Add(timeout)
+ for time.Now().Before(deadline) {
+ out, err := exec.Command("pactl", "-s", "unix:"+pulseSocket, "list", "short", "sinks").Output()
+ if err == nil && strings.Contains(string(out), "\t"+sinkName+"\t") {
+ return
+ }
+ time.Sleep(50 * time.Millisecond)
+ }
+ logf("WARNING: pulse sink %s not ready after %s", sinkName, timeout)
+}
diff --git a/server/lib/recorder/ffmeg_test.go b/server/lib/recorder/ffmeg_test.go
--- a/server/lib/recorder/ffmeg_test.go
+++ b/server/lib/recorder/ffmeg_test.go
@@ -2,6 +2,7 @@
import (
"path/filepath"
+ "runtime"
"testing"
"time"
@@ -113,6 +114,20 @@
assert.NotContains(t, args, "aresample=async=1:first_pts=0")
}
+func TestFFmpegArgs_VideoOnlyLinuxUsesWallclockTimestamps(t *testing.T) {
+ if runtime.GOOS != "linux" {
+ t.Skip("linux-only ffmpeg flag behavior")
+ }
+
+ tempDir := t.TempDir()
+ args, err := ffmpegArgs(defaultParams(tempDir), filepath.Join(tempDir, "out.mp4"))
+ require.NoError(t, err)
+
+ assert.Contains(t, args, "-use_wallclock_as_timestamps")
+ assert.NotContains(t, args, "veryfast")
+ assert.NotContains(t, args, "zerolatency")
+}
+
func TestFFmpegRecorder_ForceStop(t *testing.T) {
tempDir := t.TempDir()
rec := &FFmpegRecorder{
diff --git a/server/lib/recorder/ffmpeg.go b/server/lib/recorder/ffmpeg.go
--- a/server/lib/recorder/ffmpeg.go
+++ b/server/lib/recorder/ffmpeg.go
@@ -563,11 +563,15 @@
// Video encoding
"-c:v", "libx264",
- "-preset", "veryfast",
- "-tune", "zerolatency",
"-profile:v", "high", // Explicit web-compatible profile
"-pix_fmt", "yuv420p", // Web-standard pixel format
}...)
+ if runtime.GOOS == "linux" && recordAudio {
+ args = append(args,
+ "-preset", "veryfast",
+ "-tune", "zerolatency",
+ )
+ }
if recordAudio {
args = append(args, []string{
@@ -577,6 +581,9 @@
"-ac", "2",
}...)
}
+ if runtime.GOOS == "linux" && !recordAudio {
+ args = append(args, "-use_wallclock_as_timestamps", "1")
+ }
args = append(args, []string{
// Timestamp handling for reliable playbackYou can send follow-ups to the cloud agent here.
Reviewed by Cursor Bugbot for commit 114f47b. Configure here.
| startAll("mutter") | ||
| } | ||
| waitForSocket(pulseSocket, 10*time.Second) | ||
| startAll("chromium") |
There was a problem hiding this comment.
Pulse ready before sink exists
Medium Severity
Boot treats the PulseAudio Unix socket as ready, then starts Chromium, but start-pulseaudio.sh only creates the KernelOutput sink and keepalive stream after the socket exists. Chromium and replay capture can come up in a short window where Pulse is listening but browser playback and monitor capture are not fully wired.
Reviewed by Cursor Bugbot for commit 114f47b. Configure here.
| // Timestamp handling for reliable playback | ||
| "-use_wallclock_as_timestamps", "1", // Use system time instead of input stream time | ||
| "-reset_timestamps", "1", // Reset timestamps to start from zero | ||
| "-reset_timestamps", "1", |
There was a problem hiding this comment.
Encoder tuning always on
Low Severity
-preset veryfast, -tune zerolatency, and removal of -use_wallclock_as_timestamps apply to every Linux recording, not only when RecordAudio is enabled. Video-only replays inherit lower-quality, real-time-oriented encoding and different timestamp behavior than before this change.
Reviewed by Cursor Bugbot for commit 114f47b. Configure here.



Summary
Tests
go test ./lib/recorder -count=1DOCKER_BUILDKIT=1 docker build -f images/chromium-headful/Dockerfile -t onkernel/chromium-headful-test:latest .TESTCONTAINERS_RYUK_DISABLED=true TESTCONTAINERS_HOST_OVERRIDE=127.0.0.1 RECORDING_ZOMBO_OUTPUT_PATH=/home/agent/repos/kernel-images/server/e2e/testdata/replay-audio-zombocom.mp4 go test ./e2e -run 'TestReplayRecording(IncludesAudioTrack|ZombocomArchiveAudio)' -count=1 -vNote
Medium Risk
Touches container boot order, Pulse/Chromium audio routing, and ffmpeg encoding defaults; misconfiguration could break startup or produce silent/misaligned recordings, but behavior is env-gated and heavily tested.
Overview
Adds optional audio to ffmpeg replay MP4s by capturing the PulseAudio monitor of the browser playback sink (
KernelOutput.monitor), withRECORD_AUDIO/AUDIO_SOURCEon the API and matching supervisor env forkernel-images-api.Container audio stack: Replaces the old headful-only Pulse script with
shared/start-pulseaudio.sh(null sink +KernelInputmic + silent keepalive on the monitor), starts Pulse on the hot path before Chromium in both headful and headless images, wiresPULSE_*into Chromium, drops--mute-audioon headless, and points Neko live capture atKernelOutput.monitor. Supervisord defaults enable recording audio in images while the Go API still defaultsRECORD_AUDIOto false unless overridden.Recorder: When audio is on, ffmpeg adds a Pulse input, AAC encode, explicit stream maps,
veryfast/zerolatencyx264, and drops wallclock timestamp forcing to reduce A/V drift.Tests / tooling: New docker e2e tests assert non-silent audio in downloads, device enumeration via CDP, and an optional Zombocom archive case; unit tests cover pulse ffmpeg args; a large
verify_replay_pipeline.mjsscript compares replays to golden video/audio.Reviewed by Cursor Bugbot for commit 114f47b. Bugbot is set up for automated code reviews on this repo. Configure here.